Java: Treat x.matches(regexp) as a sanitizer for request forgery
#20688
Pull Request Overview
This PR treats calls to String.matches(regexp) as sanitizers for server-side request forgery (SSRF) vulnerabilities in Java code. The implementation assumes that any regex pattern used with .matches() provides adequate sanitization without validating the actual regex pattern.
Key Changes:
- Added a new sanitizer class `MatchesSanitizer` that treats any call to a `matches()` method as a barrier against request forgery
- Added test cases demonstrating both inline and extracted validation patterns using `.matches()`
- Added a change note documenting this new behavior
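One reason a `.matches()` check is a plausible barrier is that `String.matches` only succeeds when the entire string matches the pattern, unlike `Matcher.find`, which accepts any matching substring. The sketch below illustrates that difference; the class name, the pattern constant, and the sample input are assumptions made for illustration and are not taken from the PR.

```java
import java.util.regex.Pattern;

class FullMatchSketch {
    // Hypothetical allow-list for a single path segment (not from the PR).
    static final String SEGMENT = "[a-zA-Z0-9_-]+";

    // String.matches succeeds only if the whole input matches the pattern.
    static boolean wholeStringMatches(String input) {
        return input.matches(SEGMENT);
    }

    // Matcher.find succeeds if any substring matches, which is a much weaker check.
    static boolean someSubstringMatches(String input) {
        return Pattern.compile(SEGMENT).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "docs@evil.example"; // hypothetical attacker-controlled value
        System.out.println(wholeStringMatches(input));   // false: '@' and '.' are not allowed
        System.out.println(someSubstringMatches(input)); // true: "docs" alone matches
    }
}
```

Whether a given full-match pattern is strict enough is a separate question, which the sanitizer described here does not examine.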
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| java/ql/lib/semmle/code/java/security/RequestForgery.qll | Implements the new MatchesSanitizer class and isMatchesSanitizer predicate to treat .matches() calls as sanitizers |
| java/ql/test/query-tests/security/CWE-918/SanitizationTests.java | Adds test cases showing sanitization via regex validation with .matches() |
| java/ql/src/change-notes/2025-10-24-request-forgery-matches-sanitizer.md | Documents the new sanitizer behavior in the change notes |
| misc/scripts/create-change-note.py | Updates comment to reference change categories documentation |
}

/**
 * A qualifier in a call to a `.matches()` method that is a sanitizer for URL redirects.
Copilot
AI
Oct 24, 2025
The documentation mentions 'URL redirects' but this sanitizer is for request forgery (SSRF), not URL redirection. The comment should refer to 'request forgery' or 'SSRF' to match the actual usage context.
}

/**
 * A qualifier in a call to `.matches()` that is a sanitizer for URL redirects.
Copilot
AI
Oct 24, 2025
The documentation mentions 'URL redirects' but should refer to 'request forgery' or 'SSRF' to accurately describe the sanitizer's purpose in this context.
}

private void validate(String s) {
    if (!s.matches("[a-zA-Z0-9/_-]+")) {
Allowing / here sounds like playing with fire, so let's not do that.
Suggested change:
-    if (!s.matches("[a-zA-Z0-9/_-]+")) {
+    if (!s.matches("[a-zA-Z0-9_-]+")) {
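The review does not spell out why allowing `/` is risky. One plausible reason, offered here as an assumption rather than the reviewer's stated rationale, is that a value such as `//internal/admin` passes the permissive pattern and is treated as a network-path reference if some consumer resolves it against a base URI instead of concatenating it. The class name, host names, and the `hostAfterResolve` helper below are hypothetical.

```java
import java.net.URI;

class SlashRiskSketch {
    static final String WITH_SLASH = "[a-zA-Z0-9/_-]+";    // pattern before the suggestion
    static final String WITHOUT_SLASH = "[a-zA-Z0-9_-]+";  // pattern after the suggestion

    // Hypothetical consumer that resolves the validated value against a base URI.
    // The test code in the PR concatenates strings instead, so this is only an
    // illustration of what could go wrong in other code guarded by the same regex.
    static String hostAfterResolve(String validated) {
        URI base = URI.create("https://example.com/");
        return base.resolve(validated).getHost();
    }

    public static void main(String[] args) {
        String input = "//internal/admin";
        System.out.println(input.matches(WITH_SLASH));    // true
        System.out.println(input.matches(WITHOUT_SLASH)); // false
        System.out.println(hostAfterResolve(input));      // internal
    }
}
```

With plain concatenation onto `"https://example.com/"` the host stays `example.com`, so the stricter pattern is best read as defensive rather than as a fix for a demonstrated bypass.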
I happily took what Copilot gave me ;-)
Minor comment, otherwise LGTM. I checked that at least Go also uses regex as an SSRF sanitizer, so there's established precedent in at least one other language.
// GOOD: sanitisation by regexp validation
String safeUri10 = "https://example.com/";
String param10 = request.getParameter("uri10");
if (param10.matches("[a-zA-Z0-9/_-]+")) {
    safeUri10 = safeUri10 + param10;
}
HttpRequest r10 = HttpRequest.newBuilder(new URI(safeUri10)).build();
client.send(r10, null);

String param11 = request.getParameter("uri11");
validate(param11);
String safeUri11 = "https://example.com/" + param11;
HttpRequest r11 = HttpRequest.newBuilder(new URI(safeUri11)).build();
client.send(r11, null);
Wait a minute, are these two test cases even testing the regex sanitizer? If I've understood the existing sanitizers correctly, then the concatenation with a fixed URL ending in `/`, like `"https://example.com/" + param11`, is itself a sanitizer.
These two test cases need to replace the fixed-string URL `"https://example.com/"` with something that isn't analyzable as a potential URL prefix by the StringPrefixes library. Maybe just get the string from a collection field that we pretend is initialized or something.
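A minimal sketch of what that suggestion could look like, assuming the prefix is read from a list field so that no string constant is concatenated in front of the tainted value. The class, field, and method names are hypothetical, and the tests actually committed in the PR may be shaped differently.

```java
import java.util.List;

class NonConstantPrefixSketch {
    // The prefix comes from a collection field rather than a string literal, so
    // the existing "fixed prefix ending in /" sanitizer presumably would not apply.
    private final List<String> baseUrls;

    NonConstantPrefixSketch(List<String> baseUrls) {
        this.baseUrls = baseUrls;
    }

    // Extracted validation: throws unless the whole value matches the pattern.
    private static void validate(String s) {
        if (!s.matches("[a-zA-Z0-9_-]+")) {
            throw new IllegalArgumentException("unexpected characters in parameter");
        }
    }

    // Only the regex check stands between the parameter and the resulting URI string.
    String buildUri(String param) {
        validate(param);
        return baseUrls.get(0) + param;
    }
}
```

In a test shaped like this, turning the regex sanitizer off should produce an alert, which is the property a later comment in this thread reports checking.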
Good catch; the first test is actually ok, since safeUri10 is not a constant, but I'll update that as well.
I have verified that the updated tests both result in alerts when the sanitizer is disabled.
DCA shows that this alert in
WebGoat is somewhat silly in terms of testing QL: in this case this is the exact spot where the WebGoat SSRF exercise is supposed to succeed, i.e. the SSRF vulnerability is exploited. But of course WebGoat is written to still be in control of the exercise, so SSRF is only allowed to target one very specific website, and hence from our point of view the input is actually sanitized. So yeah.
We don't actually check the regexp, but assume that it does proper sanitization.
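To make that trade-off concrete, the sketch below shows a guard that has the same shape as a careful one but uses a pattern that accepts any input without line breaks. Because the sanitizer looks at the `.matches()` call rather than the pattern, both guards would presumably be treated alike. The class name, helper, and sample URL are assumptions for illustration only.

```java
class PermissiveRegexSketch {
    // Hypothetical helper standing in for an inline `if (x.matches(regex))` guard.
    static boolean passesGuard(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        String input = "http://169.254.169.254/latest/meta-data"; // hypothetical hostile value
        System.out.println(passesGuard(input, ".*"));              // true: the pattern restricts nothing
        System.out.println(passesGuard(input, "[a-zA-Z0-9_-]+")); // false: ':' '/' '.' are rejected
    }
}
```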